Syllable-Based Speech Recognition for Amharic
Abstract
Amharic is the Semitic language with the second-largest number of speakers, after Arabic (Hayward and Richard 1999). Its writing system is syllabic, with a Consonant-Vowel (CV) syllable structure, and its orthography has a more or less one-to-one correspondence with syllabic sounds. We have used this feature of Amharic to develop a CV syllable-based speech recognizer using Hidden Markov Modeling (HMM), and achieved 90.43% word recognition accuracy.
Similar resources
Grapheme Based Dictionaries for Speech Recognition
This report explores the potential of the grapheme as a unit for acoustic modeling in Amharic, Tamil and Telugu. While the three languages are considered phonetic, Amharic has an exception wherein syllables of the form /CVC/ may be orthographically represented as /CC/. Here, the context determines the identity of the vowel within the syllable. We employ a transcription correction model to...
Syllable-based and hybrid acoustic models for Amharic speech recognition
This paper presents the results of our experiments on the use of hybrid acoustic units in speech recognition and the use of syllable and hybrid acoustic models (AM) in morpheme-based speech recognition. Although hybrid AMs did not bring improvement in speech recognition performance when words are used as dictionary entries and units in a language model (LM), we observed a significant word error ...
Analyse des performances de modèles de langage sub-lexicale pour des langues peu-dotées à morphologie riche (Performance analysis of sub-word language modeling for under-resourced languages with rich morphology: case study on Swahili and Amharic) [in French]
This paper investigates the impact on ASR performance of sub-word units for two under-resourced African languages with rich morphology (Amharic and Swahili). Two sub-word units are considered: syllable and morpheme, the latter being obtained in a supervised or...
Automatic speech recognition for an under-resourced language - Amharic
In this paper we present the development of an Automatic Speech Recognition System (ASRS) for Amharic using limited available resources and the freely available speech toolkit (HTK). There are phonological, dialectal, orthographic and morphological features of Amharic that challenge the development of ASRSs. The problem of resource scarcity is also a hindrance to the research and development in...